####################################################################

#####    Replication package for                               #####
#####    "A dynamic ordered logit model with fixed effects"    #####

####################################################################

This folder contains all the code needed to replicate every table in the paper and the appendix.

We strongly advise preserving the folder structure provided, so that the code runs smoothly. Please copy the provided files into the main folder and place the datasets (the original files and the file "deflator.xls") in a new folder called "data". All tables can then be replicated using the files described below.

The data were made available to us by Eurostat under Contract RPP 132-2018-EU-SILC. Due to data-sharing restrictions, the original dataset (EU-SILC longitudinal data, 2005-2016) cannot be provided directly. However, it can be obtained free of charge by signing the required data protection documents.

Access to EU-SILC data was provided by the Eurostat Microdata Access Team by permission. To access the restricted data, researchers need to place an access request with Eurostat (see https://ec.europa.eu/eurostat/web/microdata). In our experience, requests are approved very swiftly.

Assuming all the data are in place, the results can be generated by running the scripts in the following order:

###########      Results based on EU-SILC data         ############

- Step 1. The script "1-DatasetCreation-part1.do" generates the estimation dataset from the raw files. It generates an output file (dataset.dta or dataset.csv). The underlying EU-SILC source files can be obtained from Eurostat (see above for availability status).
  
- Step 2. The script "2-DatasetCreation-part2.Rmd" uses the file "dataset.dta" from the previous step and generates an output file named "balanced-data.dta", which is used to construct Tables 1 and 6 in the main text and Tables 1 and 2 in the Appendix (see the next script).

- Step 3. The script "3-Analysis-part1.Rmd" uses the file "balanced-data.dta" from Step 2 and generates the main EU-SILC-based results in the manuscript and the appendix (Table 6 in the main text and Appendix Table 3). It also produces an output file named "balanced-data.dta", which is used to construct Table 1 in the main text and Tables 1 and 2 in the Appendix (see the next script).

- Step 4. The script "4-Analysis-part2.do" generates the remaining EU-SILC-based results in the manuscript and the appendix, namely Table 1 in the main text and Tables 1 and 2 from the Appendix. This code produces an output file named "sample.csv", which can be used for the simulation tables.
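
The four EU-SILC steps above could be chained from a single driver script. The following is a minimal sketch in Julia and is not part of the replication package. It assumes that a console Stata executable named `stata` and `Rscript` are on the PATH, and that the R package rmarkdown is installed to render the .Rmd files. These are assumptions, and the executable names may differ on your system. The commands are only constructed here; the commented line at the end would run them.

```julia
# Hypothetical driver for Steps 1-4 (not part of the replication package).
# Assumptions: `stata` and `Rscript` are on the PATH; rmarkdown is installed.

# Scripts in the order given in this readme.
const EUSILC_SCRIPTS = [
    "1-DatasetCreation-part1.do",
    "2-DatasetCreation-part2.Rmd",
    "3-Analysis-part1.Rmd",
    "4-Analysis-part2.do",
]

# Build the external command for one script from its file extension.
function command_for(script::AbstractString)
    ext = lowercase(splitext(script)[2])
    if ext == ".do"
        return `stata -b do $script`            # Stata batch mode
    elseif ext == ".rmd"
        expr = "rmarkdown::render('$script')"   # knit the R Markdown file
        return `Rscript -e $expr`
    else
        error("No runner defined for $script")
    end
end

commands = [command_for(s) for s in EUSILC_SCRIPTS]

# To execute the steps in order, stopping at the first failure:
# foreach(run, commands)
```

Because `run` throws an error when a command exits with a non-zero status, a failed step would stop the chain before later steps read a stale or missing input file.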

###########      Results based on Simulation data      #############    

The simulations were performed in Julia and the following packages are needed: Distributions, Optim, Plots, ProgressMeter, Weave, CSV, Tables, StatsBase, Distributed, SharedArrays, ParallelUtilities. Specific version numbers are given at the end of this readme file.

Before running each script, the user will need to change the file path inside it so that it points to the dataset.

- The script "5-Simulation-part1.jl" generates Tables 2 and 3 in the main text. Please change `n` (lines 25 and 26) to replicate all rows of both tables. It uses the file "sample.csv", generated in Step 4 above by the script "4-Analysis-part2.do".

- The script "6-Simulation-part2.jl" generates Table 4. Please change `n` to replicate all rows of the table. It uses the file "sample.csv", generated by the script "4-Analysis-part2.do".

- The script "7-Simulation-part3.jl" generates Table 5. Please change `delta` to replicate all rows of the table. It uses the file "sample.csv", generated by the script "4-Analysis-part2.do".

###########  EU-SILC data: variable description      ###############

Due to the proprietary nature of the EU-SILC data, obtained from Eurostat, we cannot provide the output dataset produced in Step 1 ("dataset.dta").

We now provide a description of the variables in that dataset.

*var1 i             : unique identifier
*var2 year          : year of the interview
*var3 country       : country of the individual, numeric
*var4 rb020         : country of the individual, string
*var5 health        : health information
*var6 married       : marital status of the individual
*var7 child         : number of children of the individual
*var8 emp           : employment status of the individual
*var9 ltdhi         : log of the total disposable household equivalized income
*var10 urbanisation : degree of urbanisation of the place where the household lives
*var11 male         : gender
*var12 ageg         : age groups
*var13 educ6        : education groups

// =========================================================================
// Variables explanation: description of the details
// =========================================================================

*var1 i       : unique identifier

*Ids are modified relative to original EU-SILC ids as follows:
*egen string c_pid = concat(country k pid), format(%60.0g)
*generate c_pid_n=c_pid
*egen i=group(c_pid_n yearl)

*var2 year   : year of the interview
*2003-2016

*var3 country   : country of the individual, numeric
*var4 rb020   : country of the individual, string
*var5 health   : health information
*genhealthlabel 1 "Very good"; 2 "Good"; 3 "Fair"; 4 "Bad"; 5 "Very bad"

*var6 married   : marital status of the individual
*marriedlabel 0 "Not married"; 1 "Married"

*var7 child   : number of children of the individual
*childlabel 0 "0" 1 "1" 2 "2" 3 ">=3"

*var8 emp   : employment status of the individual
/* original variable
1 Employee working full-time
2 Employee working part-time
3 Self-employed working full-time (including family worker)
4 Self-employed working part-time (including family worker)
5 Unemployed
6 Pupil, student, further training, unpaid work experience
7 In retirement or in early retirement or has given up business
8 Permanently disabled or/and unfit to work
9 In compulsory military or community service
10 Fulfilling domestic tasks and care responsibilities
11 Other inactive person
*/

/* 9 categories, consistent (same in all time periods)

*emp9label 1 "Full-time" 2 "Part-time" ///
3 "Unemployed" 4 "Pupil, student, further training, unpaid work experience" ///
5 "In retirement or in early retirement or has given up business" ///
6 "Permanently disabled or/and unfit to work" ///
7 "In compulsory military or community service" ///
8 "Fulfilling domestic tasks and care responsibilities" ///
9 "Other inactive person" ///
. "missing information employment status"
*/

/* variable emp has 4 consistent categories:
emp4label
1 "Full-time or Part-time" ///
2 "Unemployed" ///
3 "Pupil, student, further training, unpaid work experience, ///
In compulsory military or community service, Fulfilling domestic ///
tasks and care responsibilities" ///
4 "In retirement or in early retirement or has given up business ///
or Permanently disabled or/and unfit to work or Other inactive person" ///
. "missing information employment status"
*/

*var9  ltdhi "Log of the Total disp household equivalized income"

*var10 urbanisation "Urbanisation (1 highest, 3 lowest)"

*var11 male label 0 "Female" 1 "Male"

*var12 ageg (age groups): 1 "[18-25]" 2 "]25-35]" 3 "]35-45]" 4 "]45-55]" 5 "]55-65]" 6 "]65 or more["

*var13 educ6 "Education (1 lowest, 5 highest)"
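
The employment recoding described for var8 can be expressed as two lookup tables. The sketch below is in Julia and is only illustrative; it is not taken from the scripts in this package. The 9-to-4 mapping follows the category labels listed above. The 11-to-9 mapping assumes that employees and the self-employed are pooled by working time (codes 1 and 3 into "Full-time", codes 2 and 4 into "Part-time"), which is inferred from the labels rather than stated explicitly. The names `EMP11_TO_EMP9`, `EMP9_TO_EMP4`, and `recode_emp` are hypothetical.

```julia
# Illustrative recoding of the EU-SILC employment variable (hypothetical names).

# Original 11 codes -> 9 consistent categories.
# Assumption: employees and the self-employed are pooled by working time.
const EMP11_TO_EMP9 = Dict(
    1 => 1, 3 => 1,   # full-time
    2 => 2, 4 => 2,   # part-time
    5 => 3, 6 => 4, 7 => 5, 8 => 6, 9 => 7, 10 => 8, 11 => 9,
)

# 9 categories -> the 4 categories of `emp`, following the labels above.
const EMP9_TO_EMP4 = Dict(
    1 => 1, 2 => 1,           # full-time or part-time
    3 => 2,                   # unemployed
    4 => 3, 7 => 3, 8 => 3,   # student, military/community service, domestic tasks
    5 => 4, 6 => 4, 9 => 4,   # retired, disabled, other inactive
)

# Map an original code to `emp`; missing or unknown codes stay missing.
function recode_emp(code)
    ismissing(code) && return missing
    emp9 = get(EMP11_TO_EMP9, code, missing)
    ismissing(emp9) && return missing
    return EMP9_TO_EMP4[emp9]
end

# Example: a few original codes, including a missing value.
recode_emp.([1, 4, 5, 10, 11, missing])   # [1, 1, 2, 3, 4, missing]
```

Keeping the two steps as separate tables mirrors the description above, where the 9-category variable is an intermediate step before the 4-category `emp`.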


###########       Further notes                      ############### 

Folder structure 
-----------------

The folder structure of the replication file is as follows:

- Code: 7 script files placed in the main folder, plus 3 auxiliary files used by script "3": "feol-fedoc-likelihoods.R", "auxiliary-MRV.R", and "fedol-func-emp.R".

- Data: Contains the data files underlying the analysis (see above for availability status). Place the following in the subfolder "data": the original files obtained directly from Eurostat; the files "dataset.dta" and "dataset.csv" produced by script "1"; and the file with the deflator information uploaded to the repository, "deflators.csv".

- Output: The main folder will contain all the raw result tables, as well as further statistics produced during the analysis (e.g. standard deviations).

As indicated above, it is advisable to preserve this folder structure for replication of results, because code scripts rely on these paths for raw data fetches and export of results.
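
Before running anything, the layout described above can be checked with a few lines of Julia. This is a hypothetical pre-flight check, not part of the replication package. It only tests for the script and auxiliary file names listed in this readme and for the existence of the "data" subfolder; it does not check the Eurostat files, whose names depend on the release obtained. The names `EXPECTED_CODE` and `missing_entries` are illustrative.

```julia
# Hypothetical pre-flight check of the folder layout (not part of the package).

# Script and auxiliary files named in this readme.
const EXPECTED_CODE = [
    "1-DatasetCreation-part1.do",
    "2-DatasetCreation-part2.Rmd",
    "3-Analysis-part1.Rmd",
    "4-Analysis-part2.do",
    "5-Simulation-part1.jl",
    "6-Simulation-part2.jl",
    "7-Simulation-part3.jl",
    "feol-fedoc-likelihoods.R",
    "auxiliary-MRV.R",
    "fedol-func-emp.R",
]

# Return the expected entries that are absent under `root`.
function missing_entries(root::AbstractString)
    absent = String[]
    for f in EXPECTED_CODE
        isfile(joinpath(root, f)) || push!(absent, f)
    end
    isdir(joinpath(root, "data")) || push!(absent, "data/")
    return absent
end

# Usage: run from the main folder of the replication package.
absent = missing_entries(pwd())
if isempty(absent)
    println("Folder layout looks complete.")
else
    println("Missing: ", join(absent, ", "))
end
```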


Software 
--------

The results were generated using Stata 17 and R version 4.1.2, platform x86_64-apple-darwin17.0 (RStudio 2022.07.2 Build 576). The full package configuration used for the scripts is listed below to help troubleshoot future compatibility issues caused by package updates:
cli 3.4.1
skimr 2.1.4
tidyverse 1.3.1
fastDummies 1.6.3
pracma 2.3.8
furrr 0.3.0
openxlsx 4.2.5.1

Julia 1.8. The full package configuration used for the scripts is listed below to help troubleshoot future compatibility issues caused by package updates:
Distributions v0.25.75
Optim v1.7.3
Plots v1.32.0
ProgressMeter v1.7.2
Weave v0.10.10
CSV v0.10.4
Tables v1.9.0
StatsBase v0.33.21
ParallelUtilities v0.8.6
Distributed and SharedArrays are Julia standard libraries that ship with Julia itself, so they have no separate version number.

The operating system used was macOS 11.6.
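
The Julia versions listed above can be pinned with the built-in package manager. The following is a minimal sketch and is not shipped with the replication package; the names `PINNED` and `pinned_specs` are hypothetical. It only builds the package specifications, and the commented lines at the end would install them into an environment in the current folder.

```julia
# Sketch: pin the Julia package versions listed in this readme.
using Pkg

# Package name => version, copied from the list above.
const PINNED = [
    "Distributions"     => "0.25.75",
    "Optim"             => "1.7.3",
    "Plots"             => "1.32.0",
    "ProgressMeter"     => "1.7.2",
    "Weave"             => "0.10.10",
    "CSV"               => "0.10.4",
    "Tables"            => "1.9.0",
    "StatsBase"         => "0.33.21",
    "ParallelUtilities" => "0.8.6",
]

# Build one PackageSpec per pinned package.
pinned_specs() = [Pkg.PackageSpec(name = n, version = v) for (n, v) in PINNED]

# To install into a project-local environment, uncomment:
# Pkg.activate(".")
# Pkg.add(pinned_specs())
```

Installing into a project-local environment keeps these older versions separate from any newer ones in the default Julia environment.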


######################################################## 
######################################################## 